66 research outputs found

    A Riemannian low-rank method for optimization over semidefinite matrices with block-diagonal constraints

    Get PDF
    We propose a new algorithm to solve optimization problems of the form minf(X)\min f(X) for a smooth function ff under the constraints that XX is positive semidefinite and the diagonal blocks of XX are small identity matrices. Such problems often arise as the result of relaxing a rank constraint (lifting). In particular, many estimation tasks involving phases, rotations, orthonormal bases or permutations fit in this framework, and so do certain relaxations of combinatorial problems such as Max-Cut. The proposed algorithm exploits the facts that (1) such formulations admit low-rank solutions, and (2) their rank-restricted versions are smooth optimization problems on a Riemannian manifold. Combining insights from both the Riemannian and the convex geometries of the problem, we characterize when second-order critical points of the smooth problem reveal KKT points of the semidefinite problem. We compare against state of the art, mature software and find that, on certain interesting problem instances, what we call the staircase method is orders of magnitude faster, is more accurate and scales better. Code is available.Comment: 37 pages, 3 figure

    Near-optimal bounds for phase synchronization

    Full text link
    The problem of phase synchronization is to estimate the phases (angles) of a complex unit-modulus vector zz from their noisy pairwise relative measurements C=zz+σWC = zz^* + \sigma W, where WW is a complex-valued Gaussian random matrix. The maximum likelihood estimator (MLE) is a solution to a unit-modulus constrained quadratic programming problem, which is nonconvex. Existing works have proposed polynomial-time algorithms such as a semidefinite relaxation (SDP) approach or the generalized power method (GPM) to solve it. Numerical experiments suggest both of these methods succeed with high probability for σ\sigma up to O~(n1/2)\tilde{\mathcal{O}}(n^{1/2}), yet, existing analyses only confirm this observation for σ\sigma up to O(n1/4)\mathcal{O}(n^{1/4}). In this paper, we bridge the gap, by proving SDP is tight for σ=O(n/logn)\sigma = \mathcal{O}(\sqrt{n /\log n}), and GPM converges to the global optimum under the same regime. Moreover, we establish a linear convergence rate for GPM, and derive a tighter \ell_\infty bound for the MLE. A novel technique we develop in this paper is to track (theoretically) nn closely related sequences of iterates, in addition to the sequence of iterates GPM actually produces. As a by-product, we obtain an \ell_\infty perturbation bound for leading eigenvectors. Our result also confirms intuitions that use techniques from statistical mechanics.Comment: 34 pages, 1 figur

    Concentration of the Kirchhoff index for Erdos-Renyi graphs

    Full text link
    Given an undirected graph, the resistance distance between two nodes is the resistance one would measure between these two nodes in an electrical network if edges were resistors. Summing these distances over all pairs of nodes yields the so-called Kirchhoff index of the graph, which measures its overall connectivity. In this work, we consider Erdos-Renyi random graphs. Since the graphs are random, their Kirchhoff indices are random variables. We give formulas for the expected value of the Kirchhoff index and show it concentrates around its expectation. We achieve this by studying the trace of the pseudoinverse of the Laplacian of Erdos-Renyi graphs. For synchronization (a class of estimation problems on graphs) our results imply that acquiring pairwise measurements uniformly at random is a good strategy, even if only a vanishing proportion of the measurements can be acquired

    Computational Complexity versus Statistical Performance on Sparse Recovery Problems

    Get PDF
    We show that several classical quantities controlling compressed sensing performance directly match classical parameters controlling algorithmic complexity. We first describe linearly convergent restart schemes on first-order methods solving a broad range of compressed sensing problems, where sharpness at the optimum controls convergence speed. We show that for sparse recovery problems, this sharpness can be written as a condition number, given by the ratio between true signal sparsity and the largest signal size that can be recovered by the observation matrix. In a similar vein, Renegar's condition number is a data-driven complexity measure for convex programs, generalizing classical condition numbers for linear systems. We show that for a broad class of compressed sensing problems, the worst case value of this algorithmic complexity measure taken over all signals matches the restricted singular value of the observation matrix which controls robust recovery performance. Overall, this means in both cases that, in compressed sensing problems, a single parameter directly controls both computational complexity and recovery performance. Numerical experiments illustrate these points using several classical algorithms.Comment: Final version, to appear in information and Inferenc

    Fast convergence of trust-regions for non-isolated minima via analysis of CG on indefinite matrices

    Full text link
    Trust-region methods (TR) can converge quadratically to minima where the Hessian is positive definite. However, if the minima are not isolated, then the Hessian there cannot be positive definite. The weaker Polyak\unicode{x2013}{\L}ojasiewicz (P{\L}) condition is compatible with non-isolated minima, and it is enough for many algorithms to preserve good local behavior. Yet, TR with an exact\textit{exact} subproblem solver lacks even basic features such as a capture theorem under P{\L}. In practice, a popular inexact\textit{inexact} subproblem solver is the truncated conjugate gradient method (tCG). Empirically, TR-tCG exhibits super-linear convergence under P{\L}. We confirm this theoretically. The main mathematical obstacle is that, under P{\L}, at points arbitrarily close to minima, the Hessian has vanishingly small, possibly negative eigenvalues. Thus, tCG is applied to ill-conditioned, indefinite systems. Yet, the core theory underlying tCG is that of CG, which assumes a positive definite operator. Accordingly, we develop new tools to analyze the dynamics of CG in the presence of small eigenvalues of any sign, for the regime of interest to TR-tCG
    corecore